{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Operations on Sequences" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "-" }, "tags": [ "remove-cell" ] }, "source": [ "**CS1302 Introduction to Computer Programming**\n", "___" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "ExecuteTime": { "end_time": "2020-11-27T11:20:04.656873Z", "start_time": "2020-11-27T11:20:04.651575Z" }, "slideshow": { "slide_type": "fragment" }, "tags": [ "remove-cell" ] }, "outputs": [], "source": [ "import random\n", "%reload_ext mytutor" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Mutating a list" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "For a list (but not a tuple), subscription and slicing can also be used as the target of an assignment operation to mutate the list." ] }, { "cell_type": "code", "execution_count": 60, "metadata": { "ExecuteTime": { "end_time": "2020-11-02T23:54:20.241689Z", "start_time": "2020-11-02T23:54:20.234547Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%mytutor -h 300\n", "b = [*range(10)] # aliasing\n", "b[::2] = b[:5]\n", "b[0:1] = b[:5]\n", "b[::2] = b[:5] # fails" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The last assignment fails because `[::2]`, with a step size not equal to `1`, is an *extended slice*, which can only be assigned a sequence of the same length. By then `b` has 14 items, so `b[::2]` selects 7 items while `b[:5]` supplies only 5."
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**What is the difference between mutation and aliasing?**" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "In the previous code:\n", "- The first assignment `b = [*range(10)]` is aliasing, which gives the list the target name/identifier `b`.\n", "- Other assignments such as `b[::2] = b[:5]` are mutations that [call `__setitem__`](https://docs.python.org/3/reference/simple_stmts.html#assignment-statements) because the target `b[::2]` is not an identifier." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**Exercise** Explain the outcome of the following checks of equivalence." ] }, { "cell_type": "code", "execution_count": 166, "metadata": { "ExecuteTime": { "end_time": "2020-11-03T00:14:46.076040Z", "start_time": "2020-11-03T00:14:46.068986Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%mytutor -h 400\n", "a = [10, 20, 30, 40]\n", "b = a\n", "print('a is b? {}'.format(a is b))\n", "print('{} == {}? {}'.format(a, b, a == b))\n", "b[1:3] = b[2:0:-1]\n", "print('{} == {}? {}'.format(a, b, a == b))" ] }, { "cell_type": "markdown", "metadata": { "nbgrader": { "grade": true, "grade_id": "equivalence", "locked": false, "points": 0, "schema_version": 3, "solution": true, "task": false }, "slideshow": { "slide_type": "-" } }, "source": [ "- `a is b` and `a == b` both return `True` because the assignment `b = a` makes `b` an alias of the same object `a` points to.\n", "- In particular, the operation `b[1:3] = b[2:0:-1]` affects the same list `a` points to, so `a == b` remains `True` after the mutation."
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**Why mutate a list?**" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The following is another implementation of `composite_sequence` that takes advantage of the mutability of list. " ] }, { "cell_type": "code", "execution_count": 167, "metadata": { "ExecuteTime": { "end_time": "2020-11-03T00:16:59.245275Z", "start_time": "2020-11-03T00:16:59.228466Z" }, "scrolled": true, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4 6 8 9 10 12 14 15 16 18 20 21 22 24 25 26 27 28 30 32 33 34 35 36 38 39 40 42 44 45 46 48 49 50 51 52 54 55 56 57 58 60 62 63 64 65 66 68 69 70 72 74 75 76 77 78 80 81 82 84 85 86 87 88 90 91 92 93 94 95 96 98 99 " ] } ], "source": [ "def sieve_composite_sequence(stop):\n", " is_composite = [False] * stop # initialization\n", " for factor in range(2,stop):\n", " if is_composite[factor]: continue\n", " for multiple in range(factor*2,stop,factor):\n", " is_composite[multiple] = True\n", " return (x for x in range(4,stop) if is_composite[x])\n", "\n", "for x in sieve_composite_sequence(100): print(x, end=' ')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The algorithm \n", "1. changes `is_composite[x]` from `False` to `True` if `x` is a multiple of a smaller number `factor`, and\n", "2. returns a generator that generates composite numbers according to `is_composite`." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**Exercise** Is `sieve_composite_sequence` more efficient than your solution `composite_sequence`? Why?" 
] }, { "cell_type": "code", "execution_count": 168, "metadata": { "ExecuteTime": { "end_time": "2020-11-03T00:17:51.924382Z", "start_time": "2020-11-03T00:17:51.915347Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "ename": "NameError", "evalue": "name 'composite_sequence' is not defined", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0;32mfor\u001b[0m \u001b[0mx\u001b[0m \u001b[0;32min\u001b[0m \u001b[0mcomposite_sequence\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m10000\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0;32mpass\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mNameError\u001b[0m: name 'composite_sequence' is not defined" ] } ], "source": [ "for x in composite_sequence(10000): pass" ] }, { "cell_type": "code", "execution_count": 169, "metadata": { "ExecuteTime": { "end_time": "2020-11-03T00:18:05.062949Z", "start_time": "2020-11-03T00:18:04.758546Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [], "source": [ "for x in sieve_composite_sequence(1000000): pass" ] }, { "cell_type": "markdown", "metadata": { "ExecuteTime": { "end_time": "2020-10-29T05:38:20.563990Z", "start_time": "2020-10-29T05:38:20.159556Z" }, "nbgrader": { "grade": true, "grade_id": "sieve", "locked": false, "points": 0, "schema_version": 3, "solution": true, "task": false }, "slideshow": { "slide_type": "-" } }, "source": [ "The line `if is_composite[factor]: continue` avoids the redundant computations of checking composite factors." 
] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "**Exercise** Note that the multiplication operation `*` is the most efficient way to [initialize a 1D list with a specified size](https://www.geeksforgeeks.org/python-which-is-faster-to-initialize-lists/), but we should not use it to initialize a 2D list. Fix the following code so that `a` becomes `[[1, 0], [0, 1]]`." ] }, { "cell_type": "code", "execution_count": 170, "metadata": { "ExecuteTime": { "end_time": "2020-11-03T00:18:50.495330Z", "start_time": "2020-11-03T00:18:50.490870Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%mytutor -h 250\n", "a = [[0] * 2] * 2\n", "a[0][0] = a[1][1] = 1\n", "print(a)" ] }, { "cell_type": "code", "execution_count": 171, "metadata": { "ExecuteTime": { "end_time": "2020-11-03T00:19:11.700117Z", "start_time": "2020-11-03T00:19:11.693468Z" }, "code_folding": [], "nbgrader": { "grade": false, "grade_id": "init-2D", "locked": false, "schema_version": 3, "solution": true, "task": false }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1, 0], [0, 1]]\n" ] } ], "source": [ "### BEGIN SOLUTION\n", "a = [[0] * 2 for i in range(2)]\n", "### END SOLUTION\n", "a[0][0] = a[1][1] = 1\n", "print(a)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Different methods to operate on a sequence" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Recall the `quicksort` algorithm:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "ExecuteTime": { "end_time": "2020-11-02T23:52:14.940108Z", "start_time": "2020-11-02T23:52:14.928659Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": 
"stream", "text": [ "[45, 11, 74, 61, 91, 22, 24, 9, 64, 56]\n", "[9, 11, 22, 24, 45, 56, 61, 64, 74, 91]\n" ] } ], "source": [ "def quicksort(seq):\n", " '''Return a sorted list of items from seq.'''\n", " if len(seq) <= 1:\n", " return list(seq)\n", " i = random.randint(0, len(seq) - 1)\n", " pivot, others = seq[i], [*seq[:i], *seq[i + 1:]]\n", " left = quicksort([x for x in others if x < pivot])\n", " right = quicksort([x for x in others if x >= pivot])\n", " return [*left, pivot, *right]\n", "\n", "\n", "seq = [random.randint(0, 99) for i in range(10)]\n", "print(seq, quicksort(seq), sep='\\n')" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "There is also a built-in function `sorted` for sorting a sequence:" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "ExecuteTime": { "end_time": "2020-10-29T06:26:59.306305Z", "start_time": "2020-10-29T06:26:59.300193Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/plain": [ "[9, 11, 22, 24, 45, 56, 61, 64, 74, 91]" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "sorted?\n", "sorted(seq)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**Is `quicksort` quicker?**" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "12 µs ± 194 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)\n" ] } ], "source": [ "%%timeit\n", "quicksort(seq)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "218 ns ± 6.26 ns per loop (mean ± std. dev. 
of 7 runs, 1000000 loops each)\n" ] } ], "source": [ "%%timeit\n", "sorted(seq)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Python implements the [Timsort](https://en.wikipedia.org/wiki/Timsort) algorithm, which is very efficient." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**What are other operations on sequences?**" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "The following compares the lists of public attributes for `tuple` and `list`. \n", "- We determine membership using the [operator `in` or `not in`](https://docs.python.org/3/reference/expressions.html#membership-test-operations).\n", "- Different from the [keyword `in` in a for loop](https://docs.python.org/3/reference/compound_stmts.html#the-for-statement), operator `in` calls the method `__contains__`." ] }, { "cell_type": "code", "execution_count": 172, "metadata": { "ExecuteTime": { "end_time": "2020-11-03T00:20:02.524339Z", "start_time": "2020-11-03T00:20:02.514256Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Common attributes: count, index\n", "Tuple-specific attributes: \n", "List-specific attributes: append, clear, copy, extend, insert, pop, remove, reverse, sort\n" ] } ], "source": [ "list_attributes = dir(list)\n", "tuple_attributes = dir(tuple)\n", "\n", "print(\n", " 'Common attributes:', ', '.join([\n", " attr for attr in list_attributes\n", " if attr in tuple_attributes and attr[0] != '_'\n", " ]))\n", "\n", "print(\n", " 'Tuple-specific attributes:', ', '.join([\n", " attr for attr in tuple_attributes\n", " if attr not in list_attributes and attr[0] != '_'\n", " ]))\n", "\n", "print(\n", " 'List-specific attributes:', ', '.join([\n", " attr for attr in list_attributes\n", " if attr not in tuple_attributes and attr[0] != '_'\n", " ]))" ] }, { 
"cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- There are no public tuple-specific attributes, and\n", "- all the list-specific attributes are methods that mutate the list, except `copy`." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The common attributes\n", "- `count` method returns the number of occurrences of a value in a tuple/list, and\n", "- `index` method returns the index of the first occurrence of a value in a tuple/list." ] }, { "cell_type": "code", "execution_count": 173, "metadata": { "ExecuteTime": { "end_time": "2020-11-03T00:20:48.580635Z", "start_time": "2020-11-03T00:20:48.574007Z" }, "scrolled": true, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%mytutor -h 300\n", "a = (1,2,2,4,5)\n", "print(a.index(2))\n", "print(a.count(2))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "`reverse` method reverses the list instead of returning a reversed list." ] }, { "cell_type": "code", "execution_count": 177, "metadata": { "ExecuteTime": { "end_time": "2020-11-03T00:22:55.864740Z", "start_time": "2020-11-03T00:22:55.852504Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%mytutor -h 300\n", "a = [*range(10)]\n", "print(reversed(a))\n", "print(*reversed(a))\n", "print(a.reverse())" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "- `copy` method returns a copy of a list. \n", "- `tuple` does not have the `copy` method but it is easy to create a copy by slicing." 
] }, { "cell_type": "code", "execution_count": 178, "metadata": { "ExecuteTime": { "end_time": "2020-11-03T00:23:31.853407Z", "start_time": "2020-11-03T00:23:31.849634Z" }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%mytutor -h 400\n", "a = [*range(10)]\n", "b = tuple(a)\n", "a_reversed = a.copy()\n", "a_reversed.reverse()\n", "b_reversed = b[::-1]" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "`sort` method sorts the list *in place* instead of returning a sorted list." ] }, { "cell_type": "code", "execution_count": 179, "metadata": { "ExecuteTime": { "end_time": "2020-11-03T00:24:40.980746Z", "start_time": "2020-11-03T00:24:40.974298Z" }, "scrolled": true, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%mytutor -h 300\n", "import random\n", "a = [random.randint(0,10) for i in range(10)]\n", "print(sorted(a))\n", "print(a.sort())" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "- `extend` method extends a list *in place* instead of creating a new concatenated list.\n", "- `append` method adds an object to the end of a list.\n", "- `insert` method inserts an object at a specified position."
] }, { "cell_type": "code", "execution_count": 180, "metadata": { "ExecuteTime": { "end_time": "2020-11-03T00:25:45.278346Z", "start_time": "2020-11-03T00:25:45.274286Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%mytutor -h 300\n", "a = b = [*range(5)]\n", "print(a + b)\n", "print(a.extend(b))\n", "print(a.append('stop'))\n", "print(a.insert(0,'start'))" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "- `pop` method deletes and returns the last item of the list. \n", "- `remove` method removes the first occurrence of a value in the list. \n", "- `clear` method clears the entire list.\n", "\n", "We can also use the `del` statement to delete a selection of a list." ] }, { "cell_type": "code", "execution_count": 182, "metadata": { "ExecuteTime": { "end_time": "2020-11-03T00:27:15.536305Z", "start_time": "2020-11-03T00:27:15.529557Z" }, "slideshow": { "slide_type": "-" } }, "outputs": [ { "data": { "text/html": [ "\n", " \n", " " ], "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "%%mytutor -h 300\n", "a = [*range(10)]\n", "del a[::2]\n", "print(a.pop())\n", "print(a.remove(5))\n", "print(a.clear())" ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.8" }, "latex_envs": { "LaTeX_envs_menu_present": true, "autoclose": false, "autocomplete": true, "bibliofile": "biblio.bib", "cite_by": "apalike", "current_citInitial": 1, "eqLabelWithNumbers": true, "eqNumInitial": 1, "hotkeys": { "equation": "Ctrl-E", "itemize": 
"Ctrl-I" }, "labels_anchors": false, "latex_user_defs": false, "report_style_numbering": false, "user_envs_cfg": false }, "rise": { "enable_chalkboard": true, "scroll": true, "theme": "white" }, "toc": { "base_numbering": 1, "nav_menu": { "height": "195px", "width": "330px" }, "number_sections": true, "sideBar": true, "skip_h1_title": true, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": { "height": "454.418px", "left": "1533px", "top": "110.284px", "width": "260.994px" }, "toc_section_display": true, "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 4 }